Abstract

We previously developed an algorithm, called "resolution exchange",
which improves canonical sampling of atomic resolution models by
swapping conformations between high- and low-resolution simulations. 1
Here, we demonstrate a generally applicable incremental coarsening 
procedure and apply the algorithm to a larger peptide, met-enkephalin. 
In addition, we demonstrate a combination of resolution and tempera- 
ture exchange, in which the coarser simulations are also at elevated 
temperatures. Both simulations are implemented in a "top-down" 
mode, to allow efficient allocation of CPU time among the different 
replicas. 

Atomic resolution simulations of proteins are currently limited to short 
durations (less than one μsec) 2 or small systems (less than 100 residues). 3,4
Furthermore, accurate calculations involving large conformational changes 
are not possible for any system, as the cost of calculating entropic contri- 
butions is too great. Indeed, the cost of such calculations is only going 
to increase, as empirical force fields are improved by including polarization 
effects, either in a classical way 5,6 or in a semiclassical way.

Thoroughly sampling the space of conformations is essential for a number
of problems. From a purely biological perspective, there is a growing
awareness of the importance of protein fluctuations, over and above the
static picture, in the function of most proteins. Allostery and
conformational changes dramatic enough to be captured experimentally are
just two examples of the existence of such fluctuations. 9,10 In a
computational context, careful validation of empirical forcefields requires
confidence in the quality of conformational sampling, so that error may be
attributed to the forcefield rather than undersampling. The calculation of
free energy differences (as required for evaluation of binding affinities of
small molecules, 11 or the strength of protein-protein interactions 12) also
requires reliable sampling. 13,14

The undersampling (or "quasi-ergodicity") problem is widely recognized,
and consequently there have been many attempts to improve upon standard
simulation protocols. Methods which aim to generate a canonical
distribution of conformations include multiple time-step methods, 15,16 nonlinear
variable transformations, 17 J-walking, 18 inverse renormalization group ap- 
proaches, 19 and adaptive resolution methods. 20 The most widely used class 
of methods, however, comprises the generalized ensemble approaches. 21-23 
Of the generalized ensemble approaches, perhaps the simplest and most 
popular is parallel tempering, in which a number of copies of the system 
are evolved in parallel at different temperatures. 24-27 Occasionally, configu- 
rations are swapped between neighboring replicas, presumably allowing the 
low temperature replica to access more configuration space via high temper- 
ature conformations. 

Numerous, 28-33 as well as extensive 3,35 parallel tempering simulations
have been published, including some which claim to demonstrate the su- 
perior efficiency of the method over standard molecular dynamics (MD) 
simulation. 36-38 Regardless of the validity of those claims, there appears 
to be a limit to the utility of parallel tempering for the study of large pro- 
teins, nucleic acids, and macromolecular complexes: the number of replicas 
required to bridge a specified temperature gap increases as the square root 
of the number of degrees of freedom of the solution. 39,40 In other words, if
atomic resolution information is desired, then very many atomic resolution 
simulations are required. Recent work by Berne and coworkers partly ad- 
dresses this problem for explicitly solvated systems, so that the number of 
replicas scales with the number of degrees of freedom of the solute only. 41 
Solutes like proteins, of course, can be quite large. 

In this paper, we address the problem of insufficient sampling of implic- 
itly solvated biomolecules using a different approach. We recently developed 
an algorithm, called resolution exchange, which uses a distribution generated 
by a coarse-grained model to improve the sampling of a higher-resolution 
simulation. 1 The resolution exchange (ResEx) algorithm guarantees
canonical sampling for each level of resolution, so that the coarse-grained
simulation introduces no bias into the high-resolution simulation. The
algorithm is similar in spirit to other exchange simulations, in that
conformations are swapped between otherwise independent simulations.
However, by employing replicas of reduced resolution, ResEx has the
potential for significant efficiency gains. Other authors have recognized
the value of improving sampling with reduced resolution representations.
For example, Iftimie et al. used a classical potential as an importance
function to improve sampling of an ab initio potential. 42 Here, our goal is
to sample a classical atomic resolution potential, which leads us to a
different algorithm. Also, Liu and Sabatti formally introduced a Markov
chain Monte Carlo method which allows jumps between spaces of different
dimensions. 43 Their algorithm has apparently not been applied to the
simulation of macromolecules. Lwin and Luo recently introduced an algorithm
similar to ours, but it does not generate canonical sampling. 44

We have also employed a modification of the usual parallel protocol used 
to carry out exchange simulations, 1 generalizing the "J-walking" approach 
previously introduced by Frantz et al. 18 The J-walking (or as we call it,
"top-down" exchange) method allows an unequal distribution of CPU time 
among the various replicas. We emphasize that any exchange simulation 
may be run in top-down mode. In contrast with other exchange meth- 
ods, top-down exchange allows very little simulation time to be spent on 
the computationally expensive, high-resolution (or low temperature) repli- 
cas. Substantial improvement in sampling efficiency is therefore possible, in 
principle. 

We previously applied the resolution exchange algorithm to butane and 
dileucine peptide. 1 Here, we confront issues which arise in the study of larger 
molecules. We show that a molecule may be coarsened incrementally, so 
that the overlap between models of neighboring resolution may be adjusted 
for improved sampling efficiency. We also demonstrate that resolution and 
temperature exchange are easily combined in a single simulation, so that 
sampling may be improved by both increasing temperature and decreasing 
resolution simultaneously. The incremental coarsening procedure is first 
demonstrated on dileucine, where we check that the correct conformational 
distribution is attained. We then demonstrate successful exchange between 
an all-atom model and a united-atom model of met-enkephalin, using two
different exchange ladders: a ladder of varying resolution only, and a ladder 
which combines resolution and temperature changes. We will finish with a 
discussion of the next logical steps toward larger peptides and proteins. 
1 Theory and methods



The results presented in this paper concern two distinct, recently introduced 
simulation methods. 1 The first is resolution exchange, which allows ex- 
change between simulations at different resolutions, and preserves canonical 
sampling. The second is top-down exchange, which allows unequal distribu- 
tion of CPU time, maximizing the efficiency of an exchange simulation. In 
addition, we describe a general incremental coarsening strategy for building 
a ladder of models which improves exchange efficiency. 

1.1 Resolution exchange 

Resolution exchange (ResEx) is motivated by the effectiveness of
coarse-grained models for sampling of protein conformations, 45,46 and by the
need for atomic-level resolution for many calculations. ResEx uses
coarse-grained simulation to accelerate basin-hopping in more detailed
models. In contrast with ad hoc methods, ResEx guarantees canonical sampling
of the atomic-resolution model.

The basic idea, as in any exchange simulation, is to exchange
conformations between two simulations. How are trial configurations
constructed for an exchange between models with different numbers of degrees
of freedom? Consider a pair of models: a coarse-grained model, a
configuration of which is described by a set of coordinates $\Phi$, and an
atomic resolution model described by a larger set, $\{\Phi, x\}$. Note that
the coarse model is built from a subset of the coordinates of the detailed
model. Before the exchange, let the coarse-grained configuration be
$\Phi_a$, and let the atomic-resolution coordinates be $\{\Phi_b, x_b\}$. By
swapping only coarse variables, the trial configuration for the
coarse-grained model is simply $\Phi_b$, and for the atomic-resolution model
is $\{\Phi_a, x_b\}$.

The exchange criterion is derived by considering the two simulations
to constitute a single system characterized by the combined coordinates
$\{\Phi_a, (\Phi_b, x_b)\}$. Because the simulations run independently
(aside from the exchanges), the probability distribution of the combined
system is the simple product of the individual distributions. Let the
potential functions of the high- and low-resolution simulations be
$U_H(\Phi, x)$ and $U_L(\Phi)$ respectively, and denote the Boltzmann
factors as $\pi_H(\Phi, x; \beta_H) = e^{-\beta_H U_H(\Phi, x)}/Z_H$ and
$\pi_L(\Phi; \beta_L) = e^{-\beta_L U_L(\Phi)}/Z_L$, where $Z_H$ and $Z_L$
are the partition functions. Then the exchange attempt is accepted with the
Metropolis rate:

$$\min\left[1,\ \frac{\pi_H(\Phi_a, x_b; \beta_H)\,\pi_L(\Phi_b; \beta_L)}{\pi_H(\Phi_b, x_b; \beta_H)\,\pi_L(\Phi_a; \beta_L)}\right] \qquad (1)$$



The dependence on inverse temperature ($\beta$) is made explicit, as a reminder
that the method is naturally combined with temperature exchange, though
this of course extends to any type of exchange, such as Hamiltonian exchange. 47
Note that in the case of ordinary (temperature-based) replica exchange,
in which all the coordinates are swapped, Eq. (1) reduces to the familiar
expression $\min[1, \exp(-\Delta\beta\,\Delta U)]$.
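
As a minimal sketch of how the acceptance test of Eq. (1) might be evaluated in practice, the following Python fragment works with the logarithm of the probability ratio, so that the partition functions cancel and large energy differences cannot overflow. The function and argument names are illustrative assumptions and are not taken from any simulation package.

```python
import math


def resex_log_ratio(u_h_trial, u_h_old, u_l_trial, u_l_old, beta_h, beta_l):
    """Logarithm of the probability ratio in Eq. (1).

    u_h_trial = U_H(Phi_a, x_b)   u_h_old = U_H(Phi_b, x_b)
    u_l_trial = U_L(Phi_b)        u_l_old = U_L(Phi_a)
    The partition functions Z_H and Z_L cancel and never appear.
    """
    return -beta_h * (u_h_trial - u_h_old) - beta_l * (u_l_trial - u_l_old)


def resex_accept_probability(u_h_trial, u_h_old, u_l_trial, u_l_old,
                             beta_h, beta_l):
    """Metropolis acceptance probability min[1, ratio] for a ResEx swap."""
    log_ratio = resex_log_ratio(u_h_trial, u_h_old, u_l_trial, u_l_old,
                                beta_h, beta_l)
    # Only exponentiate negative arguments, which can underflow but not overflow.
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


# Hypothetical energies in units where beta = 1: the swap raises the
# high-resolution energy by 2 and lowers the coarse energy by 0.5.
p_accept = resex_accept_probability(3.0, 1.0, 0.5, 1.0, beta_h=1.0, beta_l=1.0)
```

When both replicas share one potential and all coordinates are swapped, `resex_log_ratio` reduces to the product of the inverse-temperature difference and the energy difference, which is the ordinary replica exchange expression.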

In a parallel implementation, Eq. (1), together with the protocol for trial
move construction, ensures that the algorithm satisfies the detailed balance
condition. To see this, consider "old" (o) and trial/"new" (n) configurations
of the combined system. In the construction of any Boltzmann-preserving
Monte Carlo move, two transition probabilities must be considered: the
conditional probability $\alpha(o \to n)$ of generating the move from
configuration o to n, and the conditional probability $w(o \to n)$ of
accepting the move. 48 Detailed balance insists that
$p(o)\,\alpha(o \to n)\,w(o \to n) = p(n)\,\alpha(n \to o)\,w(n \to o)$,
where $p(j)$ is the equilibrium probability of configuration j. Yet the
acceptance criterion (1) has the form

$$\frac{w(o \to n)}{w(n \to o)} = \frac{p(n)}{p(o)}, \qquad (2)$$

implying that the generating probabilities $\alpha$ are identical. This is
indeed the case: given a pre-defined division into coarse and detailed
coordinates, the conditional probability for the move
$o = \{\Phi_a, (\Phi_b, x_b)\} \to n = \{\Phi_b, (\Phi_a, x_b)\}$, and its
inverse, are both one. That is, given the old configuration of the
combined system, there is a unique trial configuration.

Lwin and Luo have constructed a similar algorithm, except that before
checking acceptance via Eq. (1), the high-resolution trial configuration is
minimized. 44 Such minimization (even subject to constraints on the coarse
coordinates $\Phi$, as in ref. 44) violates the detailed balance condition by
biasing the generating probability, $\alpha(o \to n)$, without any
compensating correction in the acceptance criterion. Put more simply, reverse
moves into un-minimized configurations are impossible. The consequences of
the violation are readily seen, as shown in Sec. 2.1.

1.2 Incremental Coarsening

An important practical issue is raised, however, by the construction of trial
moves without minimization. The problem is that the degrees of freedom in
the high resolution simulation $\{\Phi, x\}$ are strongly coupled: for a
protein, think of $\Phi$ as backbone degrees of freedom (DoF) and x as
side-chain DoF. Then it is clear that construction of trial moves by our
method may lead to high rejection rates. We have solved this problem by
noting that the rejection rate depends on both the number and type of DoF in
the set $\{x\}$. Employing a ladder of incremental models at intermediate
resolutions allows the acceptance rates to be tuned to reasonable values, as
shown in Fig. 1.

A ladder of incrementally coarsened models is straightforward to con- 
struct. Consider coarsening from an all-atom representation of a protein to 
a united-atom representation. In the first model above the all-atom level, 
only one residue is described by the united-atom representation — the protein 
is described by a "mixed model" , with one united-atom residue and the rest 
all-atom. Then, in the next level up, there are two united-atom residues, 
and so on, until the entire protein is described by the united-atom force 
field. A similar procedure may then be used to go beyond the united-atom
level to a united residue level. Notice that it may be desirable to coarsen 
more than one residue at a time, since some residues have fewer degrees of 
freedom than others. 
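
The ladder construction just described can be sketched in a few lines of Python. This is only an illustration under assumed conventions: the residue labels, the "AA"/"UA" tags, and the `group_size` parameter (how many residues are coarsened per step) are hypothetical and do not correspond to any force-field file format.

```python
def build_ladder(residues, group_size=1):
    """Return resolution levels from all-atom (level 0) to fully united-atom.

    Each level is a dict mapping a residue label to "AA" (all-atom) or
    "UA" (united-atom); `group_size` residues are coarsened at each step.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    levels = [{res: "AA" for res in residues}]  # level 0: everything all-atom
    for start in range(0, len(residues), group_size):
        level = dict(levels[-1])                # copy the previous level
        for res in residues[start:start + group_size]:
            level[res] = "UA"                   # coarsen the next group
        levels.append(level)
    return levels


# Hypothetical five-residue peptide, coarsened one residue at a time.
residues = ["R1", "R2", "R3", "R4", "R5"]
ladder = build_ladder(residues)
for n, level in enumerate(ladder):
    print(n, "".join("U" if level[r] == "UA" else "A" for r in residues))
```

Passing a larger `group_size` coarsens several residues per level, which gives a shorter ladder at the price of lower overlap between neighboring models.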

Of course, implementation of the incremental ladder just described re- 
quires the construction of a potential function which has both united- and 
all-atom groups. In this work, we have built this mixed potential by combin- 
ing the parameters for united and all-atom force fields into a single file. In 
other words, we created a larger parameter file, which contains both all-atom 
and united-atom atom types. This file also includes all of the interactions for
both united- and all-atom types, with the united-atom interactions modified
as described in Sec. 1.4. Adding some parameters (taken from the all-atom
potential) for the interfaces, where united and all-atom residues link, the 
mixed potential describes the whole molecule. The parameters (formatted 
for use in TINKER) are included as supplementary material. 

The incremental coarsening approach just described is rather general 
and not restricted to implementing exchange ladders spanning united- to 
all-atom resolutions. Lower resolution models could also be considered, for 
which it may be desirable to coarsen several residues at once. A first quan- 
titative analysis of the incremental coarsening procedure, suggesting how 
efficiency can be improved, is given in Secs. 2.2 and 2.3.
1.3 Top-down exchange

In many exchange simulations, whether they are temperature-based, 38
Hamiltonian-based, 47 or use some other extended ensemble, 49,50 the goal is
to improve the sampling of a hard-to-sample model (such as an all-atom
protein model at native conditions) by sampling a related model, which is
presumed easier to sample (such as the same all-atom model at higher
temperature) (see footnote 1). Information is swapped between the
simulations by occasionally exchanging configurations, in a way which
preserves canonical sampling of each distribution. Usually there is little
overlap between the hard-to-sample (henceforth, "bottom-level") and the
easy-to-sample (henceforth, "top-level") models, and therefore a ladder of
intermediate models is required.

A critical observation is that the accuracy which is ultimately attained in 
the hard-to-sample, bottom-level model is effectively limited by that which 
is obtained in the easy-to-sample, top-level model. 18 In many cases, the top 
level is still quite difficult to sample well, and will require considerable CPU 
time to reach an acceptable accuracy — much more than it would usually be 
allotted in a parallel implementation. It is this observation which motivates 
the top-down method. The top-down algorithm shown schematically in Fig. 2
was developed previously for temperature-based simulation, 18 though it was
not widely recognized as such. The procedure is as follows:

Footnote 1: For a discussion of these issues from a statistical perspective, see Neal. 51

(i) Run and store a simulation at the top level (model $M_N$) until it is
judged to be sufficiently converged. This trajectory is a sample of the
distribution $\pi_N(r)$ of the top level, where N labels the level and r
labels the configurations. In the case of ResEx, $r = (\Phi, x)$. Let n be a
running index, with n = N at this top level.

(ii) Start a simulation at the n − 1 level, for example at the next lower
temperature. Configurations $r_{n-1}$ will be sampled according to
$\pi_{n-1}$ for model $M_{n-1}$.

(iii) Whenever an exchange is to be attempted, pull a random trial
configuration $r_n$ from the $M_n$ trajectory. In the case of ResEx, one
requires only the subset $\Phi$.

(iv) Accept the trial configuration according to

$$\min\left[1,\ \frac{\pi_{n-1}(r_n)\,\pi_n(r_{n-1})}{\pi_{n-1}(r_{n-1})\,\pi_n(r_n)}\right],$$

where $\pi_i(r) = e^{-\beta_i U_i(r)}/Z_i$, $Z_i$ is the partition function,
$U_i$ is the potential function, and $\beta_i$ is the inverse temperature.
Notice the partition functions need not be known, as they cancel between
numerator and denominator.

(v) Continue with steps (iii) and (iv) until the sampling of the n − 1 level is
judged sufficient. Store the n − 1 trajectory.

(vi) Repeat steps (ii) to (v) for levels N − 2, N − 3, ... until the bottom
level simulation is complete.
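
Steps (i) to (vi) can be illustrated with a small, self-contained Python sketch. It is not the production protocol: it assumes a toy one-dimensional double-well potential shared by all levels, a purely temperature-based ladder, and simple Metropolis moves in place of Langevin dynamics. All names and parameter values (`potential`, `run_level`, `top_down`, the inverse temperatures, the step size) are hypothetical.

```python
import math
import random


def potential(x, barrier=5.0):
    """Toy double-well potential with minima at x = -1 and x = +1."""
    return barrier * (x * x - 1.0) ** 2


def run_level(rng, beta, n_steps, upper=None, beta_upper=None,
              exchange_every=10, step=0.3, x0=-1.0):
    """Sample one level with local Metropolis moves plus top-down exchanges.

    `upper` is the stored trajectory of the level above (None for the top
    level, which is sampled by local moves alone).
    Returns (trajectory, number of accepted exchanges).
    """
    x, u = x0, potential(x0)
    traj, accepted = [], 0
    for i in range(1, n_steps + 1):
        # Ordinary local Metropolis move at inverse temperature beta.
        x_new = x + rng.uniform(-step, step)
        u_new = potential(x_new)
        if u_new <= u or rng.random() < math.exp(-beta * (u_new - u)):
            x, u = x_new, u_new
        # Step (iii): pull a random stored configuration from the level above.
        if upper is not None and i % exchange_every == 0:
            x_trial = rng.choice(upper)
            u_trial = potential(x_trial)
            # Step (iv): for a shared potential the ratio reduces to
            # exp(-(beta - beta_upper) * (U(r_n) - U(r_{n-1}))).
            log_ratio = -(beta - beta_upper) * (u_trial - u)
            if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
                x, u = x_trial, u_trial
                accepted += 1
        traj.append(x)
    return traj, accepted


def top_down(betas, n_top, n_lower, seed=0):
    """Run the ladder; betas are ordered from the top level to the bottom."""
    rng = random.Random(seed)
    traj, _ = run_level(rng, betas[0], n_top)              # step (i)
    stored = [(betas[0], traj, 0)]
    for beta in betas[1:]:                                  # steps (ii)-(vi)
        beta_upper, upper, _ = stored[-1]
        traj, n_acc = run_level(rng, beta, n_lower, upper, beta_upper)
        stored.append((beta, traj, n_acc))
    return stored


levels = top_down(betas=[0.2, 0.5, 1.0], n_top=100000, n_lower=20000)
bottom = levels[-1][1]
frac_right = sum(x > 0 for x in bottom) / len(bottom)
```

In this sketch the top level receives five times more steps than each lower level, mirroring the unequal allocation of CPU time that motivates the top-down protocol; for the symmetric toy potential, `frac_right` should come out near one half.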

First, note that canonical sampling is maintained by the acceptance criterion
of step (iv), 18 just as in an ordinary parallel exchange simulation. On the
other hand, detailed balance is not satisfied, as the trial configuration for
the level n simulation ($r_{n-1}$ above) is discarded, making reverse moves
effectively impossible. The error is one of practice, not of principle,
arising from the fact that the samples of $\pi_n$ and $\pi_{n-1}$ are finite,
just as in any simulation.

To see intuitively that canonical sampling is maintained by top-down ex- 
change, imagine a pair of simulations undergoing ordinary parallel exchange. 
Unbeknownst to the investigator, however, the top level simulation is run- 
ning on a very fast processor, while the other is running on an old, slow 
processor. Between neighboring exchange points, the trial conformations 
from the fast processor will be far more decorrelated than those of the slow 
simulation, which mimics the effect of the top-down protocol. However,
these exchanges still satisfy detailed balance. In the limit of an infinitely
fast top-level simulation, trial configurations are completely decorrelated,
and one could equally well choose randomly from $\pi_n$ as in step (iii).

Second, notice that because trial configurations are pulled at random
from the sample of $\pi_n$ in step (iii), transitions which are slow in the
actual $M_n$ trajectory occur rapidly among the trial configurations. Maximum
benefit is thus obtained from successful exchanges, in contrast with a
parallel exchange simulation, where trial configurations are typically
separated by only a few picoseconds, and remain highly correlated.

Third, good results may be obtained expending very little CPU time
on all levels except the top one. This may be understood from an energy
landscape perspective. The top level has been used to thoroughly sample
the space: high barriers are crossed, and major sub-basins equilibrated. At
lower levels, only local equilibration need occur. For example, let
$\tau_{\mathrm{nonloc}}$ be the time to cross high barriers,
$\tau_{\mathrm{local}}$ be the time to equilibrate locally, and say that m
successful exchanges are needed to sample the space well. Then
$\tau_{\mathrm{local}} \times m$ CPU time is needed to sample the lower
level. The required condition to save time over a parallel simulation is that
$\tau_{\mathrm{local}} \ll \tau_{\mathrm{nonloc}}$. The degree to which this
condition is satisfied will depend on the system studied, but the top-down
approach allows the flexibility to take advantage of a separation in time
scales. This idea is reminiscent of the "dragging" of fast degrees of
freedom, suggested by Neal, 52 and the multiple time step approaches
developed by Berne and coworkers. 15,16

Finally, a major advantage of top-down simulation over parallel exchange 
protocols is that exchange attempts are nearly "free", in the sense that no 
communication is required between processors. 18 This means that exchanges 
may be attempted very frequently, and therefore much lower exchange rates 
may be accommodated. In the case of temperature exchange, this allows either
for the steps in the temperature ladder to be more widely spaced, or for the 
treatment of larger systems with fewer replicas. 

1.4 Simulation details 

In ideal circumstances, low-resolution models used in ResEx simulations
would be specifically optimized for resolution exchange. They would have
maximal conformational overlap for the common degrees of freedom. Here
we work with an "off the shelf" low resolution model (united atom) which
leads to some difficulties. Consider, for example, a Cα-C bond which is
parameterized in the two models by two slightly different natural bond
lengths. In an exchange attempt, the configurations are swapped, and in each
trial configuration the Cα-C bond is moved a bit from its preferred length.
These small contributions add up for every harmonic term in the entire
molecule, and have a noticeable effect on the acceptance of exchange moves.
We have solved this problem by simply changing the harmonic parameters of
the coarse model to match those of the detailed model. This makes the coarse
model more "exchangeable" with the detailed model. Since the coarse model
is simply "suggesting" configurations for the atomic model, and since Eq.
(1) guarantees that no bias is introduced by the coarse model, we need not
worry that the coarse model parameters are changed from their original
values. We now describe in detail the two molecular systems which were
studied in the present work.

Dileucine. We first studied dileucine peptide (ACE-(Leu)2-NME) using
the same forcefield parameters as in Ref. 1. Here, we also carried out an
incrementally coarsened ResEx simulation of dileucine in 5 levels. The
coarsest level (M4) was the same as in Ref. 1, namely a modified version of
OPLSUA. 53 In lower levels, the molecule was rendered in finer detail
beginning at the N-terminus: in M3, the N-terminal methyl group, Cα, and Cβ
of the first residue were modelled in full atomistic detail; in M2, Cγ and
both Cδ's of the first residue were additionally modelled explicitly; in M1,
the Cα, Cγ, and one Cδ of the second residue were modelled explicitly; and
finally in M0, the entire molecule was rendered in full atomic detail. The
ladder of mixed models was chosen to keep approximately fixed the number of
hydrogens by which neighboring levels differ, without splitting a methyl
group.

The top level (M4) was simulated first, for 25, 50, 100, or 200 nsec. The 
different lengths of top-level simulation were used to generate the different 
data points in Fig. 01 Then the higher resolution models were run, as per 
the top-down protocol (see Sec. 1.3), attempting exchanges once per psec.
A total of $2.5 \times 10^3$ exchanges were attempted between each level, for
a total trajectory length per level of 2.5 nsec. Frames were stored every
0.1 psec, for a total of $2.5 \times 10^4$ frames in the sample at each level
below the top.

Met-enkephalin. We next studied met-enkephalin
($\mathrm{NH_3^+}$-Tyr-Gly-Gly-Phe-Met-$\mathrm{COO^-}$). The united atom
force field was a modified version of OPLSUA. 53 The force field was modified
so that the bond length and angle bending parameters matched those of the
all-atom force field, which improves exchangeability (or conformational
overlap) of the two models. The sample of the top level model was constructed
from two independent 100 nsec trajectories, both started from PDB structure
1PLW (1st NMR model),
generated by Langevin dynamics as implemented in TINKER v. 4. 2. 5 The 
friction coefficient was 5 psec$^{-1}$, and solvation was modelled with the GB/SA
method. 55 The first 1 nsec of each trajectory was discarded and frames were
stored every 10 psec for a total of 19,800 frames in the sample.

We then ran the next higher resolution simulation, as per the top-down
algorithm (see Sec. 1.3). This model was of mixed resolution, with the Tyr1
residue represented by the OPLSAA all-atom forcefield, 56 and the remaining
residues described by the OPLSUA force field. Every 1 psec, a random
configuration was pulled from the top-level (M5) trajectory, and a resolution
exchange was attempted, with acceptance governed by Eq. (1). Since the
acceptance ratio for the M5 to M4 exchanges was approximately 10%, the
average length of M4 trajectory between exchanges was 10 psec. A total
of $10^4$ ResEx moves were attempted, for a total M4 trajectory length of 10
nsec. Frames were stored every 0.1 psec for a sample of $10^5$ frames.

This procedure was then repeated for each level shown in Fig. ^ with 
the exception that the attempt frequency of ResEx moves was adjusted 
for the acceptance ratios, so that the segments of the simulations between 
exchanges were kept approximately constant at 10 psec. Also, the total 
number of attempted exchanges was adjusted so that approximately $10^3$
successful exchanges were observed between each level, for a total trajectory
length of 10 nsec at each level. Given that the top level is presumed to be
well-sampled, $10^3$ exchanges should sample a large number of basins.



2 Results 

In a previous short paper, we tested the ResEx algorithm on two small 
molecules: butane and dileucine peptide. 1 It was shown that the method 
produced results in agreement with those obtained by standard simulation 
methods. For the sake of clarity, here we first demonstrate our approach 
on a two-dimensional toy model, consisting of two basins which differ only 
entropically. We also extend the method to two peptides, dileucine and met- 
enkephalin, in order to demonstrate the viability of incremental coarsening. 



2.1 Results: Two-dimensional model 

An important consideration in designing any sampling method is whether it
will correctly account for entropy differences. We therefore designed the
potential surface shown in the figure to compare three different sampling
methods: a "standard" Monte Carlo simulation, the same Monte Carlo with
resolution exchange, and the same Monte Carlo with the "dual REM" method of
Lwin and Luo. 44

The surface $U(x, y)$ in the figure is described by the function

$$U(x, y) = E_b (x^2 - 1)^2 + \frac{E_0\, y^2}{1 + w\,(\tanh(x/0.1) + 1)/2}, \qquad (3)$$

where $E_0 = k_B T$, $E_b = 10\,k_B T$ is the barrier height, and w controls the
width of the right well in the figure. Notice that the profile of the surface
at y = 0 is symmetric about x = 0: $U_x(x) = U(x, y = 0) = E_b (x^2 - 1)^2$, i.e.,
the two minima are of equal depth. The parameters were chosen so that the
equilibrium populations of the two wells differ greatly: we used w = 500,
so that the right well holds 95% of the population, as measured by standard
techniques.
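
The quoted population can be checked with a short estimate. The sketch below assumes that the y-dependent term of the surface is harmonic, $E_0 y^2 / d(x)$ with $d(x) = 1 + w(\tanh(x/0.1) + 1)/2$, and approximates the tanh switch by a step at x = 0; both are assumptions made for illustration. With $E_0 = k_B T$, integrating out y leaves a weight proportional to $\sqrt{d}$ in each well, while the symmetric x-profile contributes equally to both:

```latex
\int_{-\infty}^{\infty} e^{-y^2/d}\,dy = \sqrt{\pi d},
\qquad d \approx 1 \ (x < 0), \qquad d \approx 1 + w \ (x > 0),
\\[4pt]
P_{\mathrm{right}} \approx \frac{\sqrt{1+w}}{1+\sqrt{1+w}}
= \frac{\sqrt{501}}{1+\sqrt{501}} \approx 0.957 \quad (w = 500).
```

Under these assumptions the right well is favored purely entropically, by a factor of about 22, in line with the roughly 95% population stated above.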

For both the ResEx and the dual REM simulations, the "coarse-grained" 
potential was simply the one-dimensional potential U x , i.e., a symmetric 
double well. 

To describe the exchange moves explicitly, we denote 2D configurations
by (x, y) and 1D configurations by x. For both algorithms, an exchange
move consists of two parts: the construction of a 1D trial configuration
($x_{new}$) from a 2D configuration ($x_{old}, y_{old}$), and vice versa: the
construction of a 2D trial configuration ($x_{new}, y_{new}$) from a 1D
configuration ($x_{old}$). The construction of a 1D configuration in each case
is simple: the "extra" (y) coordinate is dropped, i.e., $x_{new} = x_{old}$.

The only difference between the two simulations is in the construction
of trial configurations for the 2D model from the 1D model. In ResEx, the
trial configuration is the x coordinate from the 1D model, with the (old) y
coordinate from the 2D model, i.e., $x_{new} = x_{old}$ and
$y_{new} = y_{old}$. In dual REM, on the other hand, the trial y coordinate is
chosen randomly, and then minimized. For the potential $U(x, y)$, this means
that $y_{new} = 0$ always, i.e., $x_{new} = x_{old}$ and $y_{new} = 0$.

The ResEx simulation correctly samples the two wells, giving a population
in the right well of 96.4 ± 1.6%. The dual REM simulation, on the other
hand, yields a population of 51.0 ± 1.6% for the right well. What causes the
error in the dual REM simulation? The answer is that the construction of
dual REM trial moves violates detailed balance. More specifically, the
minimization of the y coordinate means that the difference in width between
the two wells is not accounted for correctly, since in dual REM
$y_{new} = 0$ always. Notice that it is not the random selection of the y
coordinate which intrinsically violates detailed balance, only the subsequent
minimization.
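
The contrast between the two trial-move constructions can be reproduced qualitatively with a short Monte Carlo sketch. Several assumptions are made here and should not be read as the protocol used for the numbers above: the y-dependent term of the surface is taken to be harmonic, $E_0 y^2/[1 + w(\tanh(x/0.1)+1)/2]$; the coarse coordinate is drawn independently from the 1D model by rejection sampling, in the spirit of top-down exchange; and the minimized variant simply sets y to zero and applies the same Eq. (1) test. All function names and step sizes are hypothetical.

```python
import math
import random

E_B, E_0, W = 10.0, 1.0, 500.0   # energies in units of k_B T (beta = 1)


def u_coarse(x):
    """1D coarse-grained potential U_x: a symmetric double well."""
    return E_B * (x * x - 1.0) ** 2


def u_fine(x, y):
    """Assumed 2D surface: narrow left well, wide right well, equal depths."""
    width = 1.0 + W * (math.tanh(x / 0.1) + 1.0) / 2.0
    return u_coarse(x) + E_0 * y * y / width


def draw_coarse(rng):
    """Exact draw from exp(-U_x) by rejection sampling on [-2, 2]."""
    while True:
        x = rng.uniform(-2.0, 2.0)
        if rng.random() < math.exp(-u_coarse(x)):
            return x


def right_well_population(minimize_y, n_steps=100000, exchange_every=10,
                          seed=1):
    """Fraction of 2D samples with x > 0, with or without minimizing y."""
    rng = random.Random(seed)
    x, y = -1.0, 0.0
    n_right = 0
    for i in range(1, n_steps + 1):
        # Local Metropolis move in the 2D model.
        xn, yn = x + rng.uniform(-0.15, 0.15), y + rng.uniform(-3.0, 3.0)
        du = u_fine(xn, yn) - u_fine(x, y)
        if du <= 0.0 or rng.random() < math.exp(-du):
            x, y = xn, yn
        if i % exchange_every == 0:
            xt = draw_coarse(rng)           # coarse coordinate from the 1D model
            yt = 0.0 if minimize_y else y   # minimized trial vs ResEx trial
            # Logarithm of the Eq. (1) ratio with beta_H = beta_L = 1.
            log_ratio = (-(u_fine(xt, yt) - u_fine(x, y))
                         + (u_coarse(xt) - u_coarse(x)))
            if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
                x, y = xt, yt
        n_right += x > 0.0
    return n_right / n_steps


p_resex = right_well_population(minimize_y=False)
p_minimized = right_well_population(minimize_y=True)
```

With the old y retained, the exchange is a valid Metropolis move and the wide well should dominate; with y reset to zero every trial is accepted, the width of the wells never enters, and the populations should come out nearly equal.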

What is the analogous situation in molecular simulations? In this case,
both ResEx and dual REM construct trial moves in internal coordinates:
the coarse-grained model is built from a subset of the degrees of freedom of
the atomic model. For example, the coarse-grained model (the x coordinate
above) may be the backbone coordinates of a protein, and the remainder
(the y coordinate above) may be the sidechain degrees of freedom. In dual
REM construction of trial moves, the sidechains are minimized prior to
exchange, and therefore differences in entropy between different sidechain
conformations are neglected. In ResEx, there is no minimization prior to
exchange, and canonical sampling is maintained.



2.2 Results: Incremental coarsening of dileucine. 

We previously reported on ResEx results for dileucine, demonstrating
successful exchange and significant speedup from a direct exchange between
all-atom and united-atom models. Here, we show that dileucine may be
coarsened incrementally, and that (i) the correct distribution is observed
for the all-atom model and (ii) adding intermediate levels of resolution
improves efficiency.

The additional levels boost the exchange acceptance by two orders of
magnitude: exchanges were successful between M4 and M3 15.5% of the
time, between M3 and M2 12.7% of the time, between M2 and M1 29.0% of
the time, and between M1 and M0 44.0% of the time. By comparison,
exchanging AA and UA dileucine in a single step is successful only 0.16%
of the time. 1 However, we need to ask whether it is really more efficient to
introduce additional levels of simulation in order to boost the acceptance of
exchange moves.

In fact, it appears to be substantially more efficient to use incremental 
coarsening rather than abrupt coarsening. The cost for a given ladder of N 
levels may be written 

\[ \mathrm{Cost} = \mathrm{Cost}_{\mathrm{top}} + m \sum_{i=1}^{N-1} \frac{\tau_i}{r_i} \qquad (4) \]

where the cost of the top level is fixed, m is the fixed number of successful 
exchanges which are desired, $\tau_i$ is the simulation cost for an interval 
between two exchange attempts at level i, and $r_i$ is the acceptance rate 
between levels i and i + 1. We have assumed that the sampling of level i 
demands a fixed number of successful resolution exchanges, consistent with 
the motivation of the top-down protocol discussed in Sec. 1.3. 

Eq. (4) implies that the effective exchange rate for an incremental ladder 
is a reciprocal sum of the individual rates. If we assume the $\tau_i$ are 
equal for all levels (which is exact for temperature exchange), then 

\[ \frac{1}{r_{\mathrm{eff}}} = \sum_{i=1}^{N-1} \frac{1}{r_i} \qquad (5) \]

giving an effective rate for the 5 level dileucine ladder of 5.1%. This result 
suggests an improvement in efficiency relative to the single step ladder, where 
the rate was 0.156%. 1 
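As a quick numerical check of the cost and effective-rate expressions described above, the following Python sketch evaluates the reciprocal sum for the four dileucine acceptance rates quoted in the text. The function names (`effective_rate`, `ladder_cost`) and the cost units are illustrative assumptions and are not taken from the original work. With the rates as quoted, the reciprocal sum evaluates to about 5.0%, close to the reported 5.1%; the small difference presumably reflects rounding of the quoted rates.

```python
def effective_rate(rates):
    """Effective acceptance rate of a ladder: 1/r_eff = sum_i 1/r_i."""
    if any(r <= 0 for r in rates):
        raise ValueError("acceptance rates must be positive")
    return 1.0 / sum(1.0 / r for r in rates)


def ladder_cost(cost_top, m, taus, rates):
    """Total cost: fixed top-level cost plus the cost of m successful
    exchanges at every lower level (m / r_i attempts, each costing tau_i)."""
    if len(taus) != len(rates):
        raise ValueError("need one tau per acceptance rate")
    return cost_top + m * sum(tau / r for tau, r in zip(taus, rates))


# Acceptance rates quoted in the text for the five-level dileucine ladder
dileucine_rates = [0.155, 0.127, 0.290, 0.440]
r_eff = effective_rate(dileucine_rates)

print(f"effective rate: {100 * r_eff:.1f}%")
print(f"gain over the single-step rate of 0.16%: {r_eff / 0.0016:.0f}x")
```

The reciprocal sum is dominated by the smallest individual rate, so the least-overlapping pair of neighboring levels sets the efficiency of the whole ladder.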
In Fig. 4 we compare the sampling of dileucine by three different 
simulation protocols: standard Langevin dynamics, resolution exchange with 
two levels, and resolution exchange with five levels. Sampling is assessed 
by examining the relative populations of the α and β states 
($e^{-\Delta G_{\alpha\beta}/k_B T}$) considered in Ref. 1. The convergence 
of this relative population measure requires transitions between the two 
states, which occur infrequently in a standard simulation. The five-level 
ladder clearly outperforms the two-level ladder, as we are able to generate 
results both more accurate and more precise with the five-level ladder in 
an equal amount of CPU time. Note that the total simulation time required 
for the entire ladder, including the top level, is included in the ResEx 
data points. 

The efficacy of the ResEx approach is underscored by the fact that, at the 
top level (united atom), the sign of $\Delta G_{\alpha\beta}$ is wrong. That 
is, the exchange process corrects for a substantial bias in the coarse model. 
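The convergence measure used here can be illustrated with a short Python sketch that turns state counts into a free energy difference in units of k_B T. The sign convention (population of β relative to α equals exp(-ΔG/k_B T)), the state labels, and the function names are assumptions made for the illustration; the original analysis may differ in detail.

```python
import math


def delta_g_over_kt(n_alpha, n_beta):
    """Free energy difference in units of k_B T, assuming
    n_beta / n_alpha = exp(-dG / k_B T)."""
    if n_alpha <= 0 or n_beta <= 0:
        raise ValueError("both states must have been visited")
    return -math.log(n_beta / n_alpha)


def running_estimates(labels):
    """Running estimate along a trajectory of state labels.

    Frames labelled neither 'alpha' nor 'beta' are ignored. The estimate
    is None until both states have been visited at least once.
    """
    n_alpha = n_beta = 0
    estimates = []
    for label in labels:
        if label == "alpha":
            n_alpha += 1
        elif label == "beta":
            n_beta += 1
        if n_alpha and n_beta:
            estimates.append(delta_g_over_kt(n_alpha, n_beta))
        else:
            estimates.append(None)
    return estimates
```

Because the estimate is undefined until both states have been visited, and changes appreciably only when a transition occurs, its convergence is a direct probe of how often a protocol crosses between the two states.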

2.3 Results: Incremental coarsening of met-enkephalin 

Met-enkephalin is a flexible neurotransmitter which participates in immune 
responses and pain inhibition, among other roles. 57,58 By virtue of its small 
size and biological interest, it is often used to test new simulation 
methods 27,59,60 and compare existing protocols. 58,61 

Using met-enkephalin, we demonstrate the efficacy of the incremental 
coarsening procedure for a ladder of decreasing resolution at constant tem- 
perature, and for a ladder of simultaneously decreasing resolution and in- 
creasing temperature. Because quantifying the quality of sampling for met- 
enkephalin is considerably more difficult than is commonly appreciated, we 
will not present a detailed efficiency analysis. More will be said on this 
second topic in the discussion. 

We employed the ResEx algorithm in a top-down framework, as sketched 
in Fig. 2. First, the top-level simulation (coarsest resolution; here, 
united-atom) was run. We then ran an exchange simulation at the next highest 
resolution — here, one residue was represented at all-atom resolution, and the 
rest of the peptide was united-atom. This procedure was continued, "de- 
coarsening" one residue at a time, until the entire peptide was represented 
at the all-atom level. Details are given in Sees. and ITU 

The incremental coarsening procedure substantially increases exchange 
acceptance. The five rates in the six-level ladder vary from 2.4% to 18%, as 
shown in Fig. 1. For comparison, exchanging between all-atom and 
united-atom models of met-enkephalin, with no intermediate levels of 
resolution, results in an acceptance ratio of 0.09%. The acceptance ratios 
vary, in part, according to the number of degrees of freedom by which the 
two levels differ. 

For met-enkephalin, the principal results are the acceptance rates shown 
in Fig. 1, which are significant for several reasons. First, they demonstrate 
the first implementation of the incremental coarsening approach in a 
complex peptide. Second, because they are well within the practical range of 
the top-down protocol (see Sec. 1.3 and the Discussion), they indicate that 
the ResEx algorithm could prove important for larger peptides. Lastly, by 
comparing the effective exchange rate suggested by Eq. (5), 
$r_{\mathrm{eff}}$ = 5.2%, with the rate of 0.09% for direct exchanges 
between united- and all-atom models, one sees that a substantial speedup 
has been achieved. Of course, the magnitude of the improvement is merely 
suggestive: without a rigorous quantification of the sampling quality, 
there can be no rigorous comparison of efficiency. Such a quantification is 
beyond the scope of this work, as noted in the Discussion. 

It is useful to understand the intuitive reason behind the advantage of 
incremental coarsening. If one writes the acceptance criteria (2) and (1.3) in 
the form $\min[1, e^{-\epsilon}]$, then for exchanges between models of 
greatly differing resolution, one typically finds the dimensionless "energy" 
is large, i.e., $\epsilon > 1$. It seems to be roughly true that this energy 
is proportional to the difference in the number of degrees of freedom in the 
models being exchanged. However, if the change is made incrementally using 
many models $M_i$, then between levels i and i+1 there is a relatively small 
cost $\Delta\epsilon_i$, with $\sum_i \Delta\epsilon_i \sim \epsilon$. 
It is clear that with enough increments, one can achieve 
$\Delta\epsilon_i \ll 1$, and thus create a high likelihood for exchange 
since the corresponding Boltzmann factors are much larger: 
$r_i \sim e^{-\Delta\epsilon_i} \gg e^{-\epsilon}$. This is exactly what is 
embodied in Eq. (4). The trade-off is that one pays the cost for simulating 
the additional intermediate levels. However, as has been stressed in Sec. 1.3, 
the intermediate-level simulations are very short compared to the top level. 
In the present context, for instance, the top level met-enkephalin trajectory 
is 198 nsec, while all other levels were simulated for only 10 nsec. The net 
savings can be quite substantial, especially considering that good sampling 
is achieved by increasing the number of exchanges. 
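The argument can be made concrete with a toy calculation in Python. It assumes, purely for illustration, that the total dimensionless energy is split equally among the increments (the text states only that the increments sum to roughly the total), and it uses a total of 7, chosen so that the single-step factor is about 0.09%, comparable to the direct met-enkephalin rate. All function names are hypothetical.

```python
import math


def single_step_rate(eps):
    """Acceptance estimate min[1, exp(-eps)] for one abrupt exchange."""
    return min(1.0, math.exp(-eps))


def incremental_rates(eps, n_steps):
    """Per-step estimates when eps is split equally over n_steps (assumption)."""
    if n_steps < 1:
        raise ValueError("need at least one step")
    return [min(1.0, math.exp(-eps / n_steps))] * n_steps


def effective_rate(rates):
    """Reciprocal-sum effective rate: 1/r_eff = sum_i 1/r_i."""
    return 1.0 / sum(1.0 / r for r in rates)


eps_total = 7.0  # illustrative value, not a measured quantity
direct = single_step_rate(eps_total)
for n in (1, 2, 5, 10):
    r_eff = effective_rate(incremental_rates(eps_total, n))
    print(f"{n:2d} increments: r_eff = {100 * r_eff:.3f}%  "
          f"({r_eff / direct:.0f}x the direct rate)")
```

Under the equal-split assumption the effective rate is exp(-ε/n)/n, which peaks when the number of increments is close to ε, so in this toy model adding levels beyond that point no longer helps.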

While we cannot yet rigorously measure sampling quality, we can show 
that the results obtained with ResEx are consistent with those obtained by 
standard methods, by comparing Ramachandran histograms (Fig. 5) from 
the ResEx simulation to those obtained by standard simulation (990 nsec 
of simulation with the M0 parameters). Overall, the agreement between the 
ResEx simulation and the 990 nsec conventional simulation is quite good. 
However, a careful comparison reveals a region on the Phe4 plot, labelled 
"A", which is noticeably under-sampled by the ResEx simulation, as 
compared to the 990 nsec Langevin dynamics trajectory. The explanation is 
provided by an inspection of the Phe4 histogram of the united-atom 
simulation: region "A" was not sampled by the top-level simulation. The 
failure points to a potential weakness of the ResEx (or any exchange) method: 
regions which are not sampled by the top level will be difficult to sample 
in any of the other levels. This is a specific instance of a general problem 
that occurs whenever auxiliary distributions are used to enhance sampling 
of some "distribution of interest," namely the need to balance overlap with 
wider sampling via the auxiliary distribution. 51 In other words, it is a failure 
of the top-level simulation rather than the algorithm. 
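A comparison of this kind can be automated. The sketch below, which is only one possible approach and not the analysis used for the figure, builds normalized Ramachandran histograms with NumPy and lists the bins that are populated in a reference trajectory but empty in a test trajectory. The bin count, the population threshold, and the function names are illustrative assumptions; angles are taken to be in degrees.

```python
import numpy as np


def rama_hist(phi, psi, bins=36):
    """Normalized 2D histogram of (phi, psi) angles given in degrees."""
    hist, _, _ = np.histogram2d(
        phi, psi, bins=bins, range=[[-180.0, 180.0], [-180.0, 180.0]]
    )
    return hist / hist.sum()


def missed_bins(ref_hist, test_hist, min_pop=0.01):
    """Indices (phi_bin, psi_bin) populated in ref but never visited in test."""
    return np.argwhere((ref_hist >= min_pop) & (test_hist == 0))


# Toy data: the reference visits two regions, the test trajectory only one.
phi_ref = np.array([-65.0] * 50 + [65.0] * 50)
psi_ref = np.array([-45.0] * 50 + [45.0] * 50)
phi_test = np.array([-65.0] * 100)
psi_test = np.array([-45.0] * 100)

missed = missed_bins(rama_hist(phi_ref, psi_ref), rama_hist(phi_test, psi_test))
print(missed)
```

Applying the same test between the top-level and bottom-level histograms would flag regions, like "A", that the coarse model never supplies to the ladder.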

Interestingly, Fig. 5 also presents two counterexamples to the foregoing 
discussion. Regions "B" of the Gly3 and "C" of the Met5 plots were both 
well-sampled by the ResEx simulation, despite being infrequently visited 
by the top-level. That is, the ResEx acceptance criterion (^Q) correctly "re- 
weights" the conformation space of the all-atom model by allowing normal 
dynamics to continue when appropriate. Nevertheless, we are in the process 
of experimenting with other "schedules" (combinations of attempt frequency 
and number of exchange attempts) to balance the normal and the exchange 
dynamics. 

Ideally, the coarse model distribution would have better overlap with 
the high-resolution distribution, and the balance could be adjusted to favor 
exchanges over normal dynamics. This would allow the same quality of 
sampling with less simulation at each level below the top. In the long term, 
we hope to design coarse models constructed to not eliminate any regions 
of configuration space in more detailed models. 

2.4 Resolution exchange with tempering 

We have also explored the possibility of combining resolution exchange with 
parallel tempering, so that the sampling of the reduced models is improved 
both by the reduction in resolution and by increased temperatures. In a 
standard parallel tempering simulation, the temperatures are roughly ex- 
ponentially distributed, in order that the conformational overlap between 
neighboring temperatures is constant over the ladder. However, there is no 
simple relationship between the change in resolution and the acceptance of 
resolution exchanges. Some care must therefore be taken with the assign- 
ment of the temperature ladder. 
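For reference, the standard parallel tempering baseline mentioned above can be sketched in Python: a geometric (roughly exponential) temperature ladder together with the usual temperature-exchange acceptance probability. This is the ordinary tempering criterion, not the combined resolution/temperature criterion used in this work, and the endpoints, the number of levels, the energy units (kcal/mol), and the function names are illustrative assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def geometric_ladder(t_min, t_max, n_levels):
    """Temperatures with a constant ratio between neighboring levels."""
    if n_levels < 2 or t_min <= 0 or t_max <= t_min:
        raise ValueError("need n_levels >= 2 and 0 < t_min < t_max")
    ratio = (t_max / t_min) ** (1.0 / (n_levels - 1))
    return [t_min * ratio**i for i in range(n_levels)]


def pt_acceptance(e_i, e_j, t_i, t_j):
    """Standard tempering swap probability min(1, exp[(b_i - b_j)(E_i - E_j)])."""
    delta = (1.0 / (K_B * t_i) - 1.0 / (K_B * t_j)) * (e_i - e_j)
    return 1.0 if delta >= 0 else math.exp(delta)


ladder = geometric_ladder(298.0, 700.0, 6)
print([round(t, 1) for t in ladder])
```

When the levels also differ in resolution, a constant temperature ratio is no longer guaranteed to give uniform overlap, which is why a simple geometric ladder can serve only as a starting point.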

We began with the ladder of models in Fig. 1. The acceptance ratios 
give an idea of the temperature gap which may be tolerated between two 
levels — a higher acceptance ratio will tolerate a larger jump in temperature. 
However, compared to standard parallel tempering simulations, it may 
seem that the acceptance ratios are already too low to accommodate 
tempering in addition to resolution exchange. After all, we may expect that 
any difference in temperature will lower the acceptance of exchange moves. 
In this regard, the top-down approach has an important advantage over a 
parallel implementation. Since exchange attempts are "free" (no 
communication between processors is required), they may be attempted much 
more frequently, and lower acceptance ratios may be tolerated. 18 Indeed, in 
our original study of dileucine peptide with top-down resolution exchange, the 
acceptance ratio was much less than 1%. 1 See also Sec. 1.3. 

The ladder combining temperature and resolution is shown in Fig. 6. The 
temperature gaps were chosen by trial and error, aiming for an acceptance 
of attempted exchanges of a few percent between neighboring levels. Based 
upon this restriction, the top-level simulation was run at a temperature 
of 700 K, which is comparable to previously published parallel tempering 
studies of met-enkephalin. 27,36 We expect, however, that sampling at fixed 
CPU cost should be improved relative to ordinary replica exchange, by 
virtue of the reduction in resolution. 

The reduction in resolution confers an additional benefit when combined 
with tempering. Since the overlap between neighboring levels in a parallel 
tempering simulation scales like $(\text{number of DoF})^{-1/2}$, reducing resolution 
allows the temperature gaps to increase as the resolution is reduced. The 
overlap between neighboring levels in a combined resolution/tempering lad- 
der is thus controlled both by the change in resolution, and the change in 
temperature, with the two effects compensating one another in an unknown 
way. Indeed, we observed one puzzling case in our search for an appropriate 
resolution/temperature ladder. In one ladder (data not shown), exchange 
between levels M2 and M3 was successful about 7% of the time when both 
were at 298 K, while exchange occurred approximately 11% of the time 
when M2 was thermostatted to 305 K, and M3 to 320 K. We have not ex- 
plained this result — though it should be remembered that different models 
have different landscapes, and therefore temperatures may not be directly 
compared. 

3 Concluding Discussion 

We have extended our resolution exchange (ResEx) method 1 using an incre- 
mental coarsening procedure for implicitly solvated peptides. After carefully 
testing the approach in the two-residue dileucine peptide, we applied it 
successfully to the five-residue met-enkephalin. Incremental coarsening allows 
tuning of the conformational overlap between models of differing resolution, 
and therefore makes practical simulations which would otherwise be ham- 
pered by poor acceptance of exchange moves. We have also demonstrated 
that resolution exchange is naturally combined with parallel tempering, so 
that the reduced resolution models may be aided in their sampling of con- 
formations by elevated temperatures. 

Ramachandran histograms demonstrate that, for the most part, ResEx 
simulation is consistent with standard simulation techniques. In one case, 
however, they reveal a weakness of our method — a top-level simulation which 
eliminates important regions of conformation space will result in poor sam- 
pling at the bottom level. This weakness is shared by any simulation which 
relies upon auxiliary ensembles to sample among major sub-basins. In the 
future we hope to eliminate this problem by more careful construction of 
reduced models. 

Of course, we hope to treat still larger molecules with the ResEx method. 
Since it is essential that the top-level be well-sampled, the treatment of larger 
molecules will require yet coarser top-level simulation. This will likely re- 
quire incremental coarsening from the united-atom level to a model with 
one or two beads per residue. Suitable models are under development. It 
appears that the ResEx approach cannot be applied easily to explicitly sol- 
vated systems; however, given the difficulty and importance of sampling 
implicitly solvated systems, ResEx may prove very valuable for biomolecu- 
lar simulation. 

We have also developed an alternative rigorous algorithm which permits 
the use of coarse top-level simulations to generate atomically detailed 
canonical samples. It is essentially a "decorating" procedure, and it 
eliminates the potential issue of correlations between coarse coordinates 
$\Phi$ and detailed coordinates x, which could reduce acceptance rates in 
resolution exchange. Specifically, after generating a low-resolution sample 
distributed according to $\pi_L(\Phi)$, one can independently sample detailed 
coordinates x according to an arbitrary distribution $\pi_x(x)$. (For example, 
$\pi_x$ could be based on harmonic terms in the full forcefield.) Full 
configurations are thus generated according to the simple product 
$\pi_L(\Phi)\,\pi_x(x)$ and may be reweighted to generate a fully detailed, 
high-resolution distribution $\pi_H(\Phi, x)$ using standard methods. 62 In 
the long term, the decorating approach may prove useful for adding explicit 
solvent. It may also be implemented in an incremental fashion. 
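The reweighting step of such a decorating procedure amounts to standard importance sampling, which the following Python sketch illustrates. It assumes that the log-densities of the high-resolution target, the coarse distribution, and the decorating distribution have already been evaluated for each generated configuration (each up to an additive constant); the function names are hypothetical and the sketch is not the implementation referred to in the text.

```python
import numpy as np


def importance_weights(log_pi_high, log_pi_low, log_pi_x):
    """Normalized weights w ~ pi_H(Phi, x) / [pi_L(Phi) * pi_x(x)].

    Works in log space and subtracts the maximum before exponentiating,
    so unnormalized log-densities are acceptable and overflow is avoided.
    """
    log_w = np.asarray(log_pi_high, dtype=float) - (
        np.asarray(log_pi_low, dtype=float) + np.asarray(log_pi_x, dtype=float)
    )
    log_w = log_w - log_w.max()
    w = np.exp(log_w)
    return w / w.sum()


def effective_sample_size(weights):
    """Kish effective sample size, 1 / sum(w^2), for normalized weights."""
    return 1.0 / np.sum(np.asarray(weights) ** 2)


# Two configurations whose unnormalized weights differ by a factor of three
w = importance_weights([np.log(3.0), 0.0], [0.0, 0.0], [0.0, 0.0])
print(w, effective_sample_size(w))
```

A small effective sample size signals poor overlap between the product distribution and the high-resolution target, which is the analogue here of a low exchange acceptance rate.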

An "auxiliary" question which remains to be carefully addressed is the 
quantification of sampling efficiency. There are numerous proposals for 
judging whether a simulation is converged: some are based on principal 
components, 63 others on energy-based ergodic measures, 64 and our own work 
in progress uses structural histograms. 65 Which one provides an appropriate 
measure depends on what properties are of interest. For applications 
which depend on the relative populations of various conformations, such as 
calculation of binding affinities for small molecules, a measure which 
depends directly on the conformational distribution is needed. Such a method 
is under development; for now we only mention that structural histograms 
provide a much more sensitive signal of non-convergence than energy-based 
methods. 65 
Figure 1: Two different ladders used to exchange all-atom with united-atom 
met-enkephalin. Residues are depicted with ovals — open corresponds to an 
all atom representation, filled to united atom. The ratios of successful to 
attempted exchanges between each level are indicated by the percentages. 
Figure 2: Schematic representation of top-down exchange. Thick horizontal 
lines are simulation trajectories (labelled "$M_i$" for model "i") and arrows 
represent exchanges. The $M_i$ may differ in resolution, temperature, or 
both. Notice that the top level simulation may be considerably longer than 
the others. 
Figure 4: Comparison of different ResEx protocols for dileucine. Plotted 
are free energy difference estimates between the α and β states as a 
function of total CPU cost. The dashed lines are individual runs generated by 
standard Langevin dynamics (no exchange), and the solid horizontal line 
is the average of 4 independent 150 nsec Langevin dynamics simulations. 
The solid circles are ResEx results from the two level ladder from Ref. 1, and 
the diamonds are the ResEx results from the five level ladder, averaged in 
each case over 8 independent runs. The error bars give the range of the 
8 independent runs. The ResEx data points are displaced from the origin 
to accurately reflect the time invested in generating the top level and all 
intermediate level distributions. The exchange schedule for ResEx has not 
been optimized. 
Figure 5: Ramachandran histograms for met-enkephalin. The left column 
is a 990 nsec Langevin dynamics simulation at all-atom resolution, without 
resolution exchange; the middle column is the all-atom level (M0) from 
resolution exchange as described in the text; the right column is the 
top-level united-atom simulation (level M5) used for the resolution exchange 
simulation shown in the middle column. Note that since the peptide is 
unblocked, there are only 4 pairs of φ-ψ dihedrals. ResEx fails to "find" 
one region (labelled "A") not present in the top-level simulation, but finds 
two others (labelled "B" and "C"). 
Figure 6: Ladder combining exchange between all-atom and united-atom 
met-enkephalin with tempering of reduced resolution simulations. Residues 
are depicted with ovals — open corresponds to an all atom representation, 
filled to united atom. The ratios of successful to attempted exchanges be- 
tween each level are indicated by the percentages. The temperature of each 
level is indicated on the right. 